2.0  FOREWORD


The investigators have abided by the National Institutes of Health Guidelines for
Research Involving Recombinant DNA Molecules (April 1982) and the Administrative Practices
Supplements, as indicated in the Memorandum of Understanding and Agreement, approved by the
Institutional Biosafety Committee and N.I.H.

Citations of commercial organizations and trade names in this report do not
constitute an official Department of the Army endorsement or approval of the products or services
of these organizations.


3.0  INTRODUCTION


Dengue, a human disease of global significance, is caused by dengue virus, a member of
the newly formed family Flaviviridae, which comprises about 70 closely related enveloped
viruses (Westaway et al., 1985). The genome of these viruses is a single-stranded RNA of about
11 kb with positive-strand polarity (Russell et al., 1980). The RNA has a type I cap structure
and lacks a poly(A) tract at the 3' end. Dengue viruses occur as four distinct serotypes
(DEN-1 to DEN-4) and are transmitted to humans principally by Aedes aegypti mosquitoes. In
endemic areas of tropical Asia, apart from dengue fever (DF), a more severe form of the
disease, dengue hemorrhagic fever (DHF), occurs in children and can lead to dengue shock
syndrome (DSS). The pathogenesis of dengue was recently the subject of an excellent review by
Halstead (1988).

The wide geographical occurrence of dengue infections, combined with the increasing number
of epidemics in Central and South America and the Caribbean, is a cause of major concern. An
effective vaccine is not available to protect individuals against all four serotypes of dengue
virus. The major problem associated with a dengue vaccine is that individuals having protection
against one serotype are fully susceptible to infection with the other serotypes. More often,
the secondary infection with another serotype results in a serious form of the disease, DHF.
Moreover, there are geographical heterogeneities in multiple dengue serotypes, as well as
genotypic variants of the same serotype (Trent et al., 1983; Repik et al., 1983; Kerschner et
al., 1986; Walker et al., 1988). Using the techniques of RNA oligonucleotide fingerprinting and
hybridization with synthetic DNA probes, Ui genotypic variants for DEN-2, 7 for DEN-1 and 5 for
DEN-3 have been characterized (Trent et al., 1983; Repik et al., 1983; Kerschner et al., 1986).
Because of the rapid occurrence of variations in dengue viruses, there is a need to obtain the
complete nucleotide sequence data of all the DEN serotypes and also of the important strains of
the same serotype. This would make it possible to relate protein structure to specified surface
epitopes and facilitate the development of a recombinant vaccine.

The first complete nucleotide sequence of a flavivirus reported was that of yellow fever
virus (YF) (Rice et al., 1985). This study established that there is a single long ORF coding
for a large polyprotein, which is then cleaved by cellular and/or viral proteases to form the
mature structural proteins, capsid (C), membrane (M), and envelope (E), and the nonstructural
proteins NS1, ns2a, ns2b, NS3, ns4a, ns4b, and NS5, respectively. Recently, the nucleotide
sequence data for the DEN-2S1 candidate vaccine strain derived from the PR-159 isolate and for
DEN-2JAM have been reported (Hahn et al., 1988; Deubel et al., 1988). Between the same
topotypes, variations of about 10% in nucleotide sequences were noted. In addition, there were
deletions of 20 nt in DEN-2S1 compared with DEN-2JAM. A partial sequence totalling 5472 nt of
cDNA clones from DEN-2NGS-C has been reported (Yaegashi et al., 1986; Putnak et al., 1988). In
this report, we present the complete nucleotide sequence of the genome of the DEN-2NGS-C strain
and compare it with those of two other DEN-2 strains. Our results indicate that DEN-2NGS-C is
more similar to DEN-2JAM than to the DEN-2S1 candidate strain from the PR-159 isolate.

4.0  Body  of  the  Report 

Cloning  of  the  region  of  DEN-2  RNA  encoding  the  structural  proteins. 

(a).  Rationale: 

One of the overall objectives of the Contract proposal is to sequence the entire dengue 2
virus genome. In the last ANNUAL REPORT, dated November 17, 1987 [for the work done during
September 15, 1986-September 14, 1987], the DNA sequence analysis of the regions encoding the
nonstructural proteins NS1, ns2a, ns2b, NS3, ns4a, ns4b, and up to 528 amino acids in the NS5
coding region was reported, totalling 7446 nucleotides, which was about 70% of the viral
genome. This report contains the complete sequence with the exception of only seven nucleotides
in the protein noncoding region. Therefore, specific aim #1 of the Contract is essentially
fulfilled.


1.  Cell culture, DEN-2 virus, and RNA


DEN-2 (New Guinea strain), originally isolated in 1944 (Sabin and Schlesinger, 1945), is
the prototype strain of DEN-2 viruses. The virus stock used in this study had been passed 38
times in suckling mouse brain and was used to infect Aedes albopictus C6/36 cells in 175-cm2
tissue culture flasks at a moi of < 0.5. The virus particles released into the growth medium
were harvested by ultracentrifugation (100,000 x g for 3 h) seven and 13 days postinfection.
The virus was adsorbed by immunoaffinity chromatography using the monoclonal antibody 4G2,
raised against the structural glycoprotein E, which was linked to Protein-A Sepharose (Sigma
Chemical Co.). The virus particles adsorbed to the affinity column were directly disrupted by
passing a buffer (10 mM Tris-HCl, pH 7.5, 0.1 M NaCl, and 1 mM EDTA) containing 0.1% SDS. The
viral RNA was collected in polypropylene tubes (Eppendorf) containing chloroform-saturated
phenol. Subsequent to two extractions with phenol, the RNA was precipitated by the addition of
2.5 vols. of ethanol and stored at -70°C until use. The integrity of the RNA was checked by
electrophoresis on an agarose gel and was found to be predominantly (>90%) full length when
isolated by this procedure.

2.  Synthesis of the cDNA copy of DEN-2 RNA

The nucleotide sequence of DEN-2 cDNA clones totalling 5472 nt, encoding the
nonstructural proteins NS1, ns2a, ns2b, ns4b, and portions of NS3, ns4a, and NS5 of the
polyprotein precursor, was reported previously (Yaegashi et al., 1986; Putnak et al., 1988). In
order to clone the structural region upstream of the NS1 region, a synthetic primer
CGTGAATrCAdTCCTA'I'CCA'r (complementary to nt 2328-2348, in Fig. 2) was used for the
reverse-transcriptase catalyzed cDNA synthesis. The experimental conditions of the cDNA
synthesis were as described (Yaegashi et al., 1986). Briefly, the DEN-2 RNA was denatured with
methylmercuric hydroxide (Bailey and Davidson, 1976) in the presence of the primer. Subsequent
to an annealing step, the cDNA synthesis was carried out as described (Okayama and Berg, 1982;
Gubler and Hoffman, 1983; Maniatis et al., 1982). Following methylation of EcoRI sites, the
double-stranded cDNA was ligated to an EcoRI linker, digested by EcoRI, and size-fractionated
by electrophoresis on an agarose gel. The cDNA fragments were cloned at the EcoRI site of
pUC18 (Vieira and Messing, 1982). Alternatively, the blunt-ended cDNA fragments were cloned
into the PstI-cut and PolIk-treated pUC18 vector. The transformants were screened by restriction
enzyme digestion. One cDNA clone of about 2.4 kb in length and several independent clones of
various lengths were obtained from the region upstream of the primer site. cDNA clones were also
obtained by using random primers for the synthesis of the first strand cDNA (Taylor et al., 1976;
Rice et al., 1981). They were ordered on the DEN-2 genome by hybridization using other cDNA
clones, which were sequenced previously (pVV1, pVV17, and pVV9; Yaegashi et al., 1986), as
probes.

To clone the cDNA containing the 3'-end of the genome, the DEN-2 RNA was tailed with
poly(A) using E. coli poly(A) polymerase (Sippel, 1973; Gething et al., 1980). The first strand
cDNA was synthesized using a primer containing a stretch of T residues
CC(’CCCGGGTCTAGA(T)15T-OH) to initiate DNA synthesis from the 3' terminus of DEN-2
RNA. Duplex cDNA was synthesized as described previously (Okayama and Berg, 1982;
Gubler and Hoffman, 1983). This cDNA library was used to amplify the region containing the
3'-terminal sequences of the DEN-2 genome. For amplification, the chain reaction catalyzed by
Taq polymerase (Perkin-Elmer-Cetus Corp., CT, U.S.A.) was used (Saiki et al., 1988; Scharf et
al., 1986). The oligodeoxynucleotides GGACAAGTrGGTACCTAl GG (nt 9373-9392 in Fig. 2)
and GCCCCTCTAGA(T)15T-OH were used as primers for amplification by Taq polymerase. The
amplified DNA, after a total of 25 cycles of denaturation, annealing and DNA synthesis, was
purified by electrophoresis on an agarose gel. The 1.4-kb DNA fragment was digested with KpnI
+ XbaI prior to cloning at the corresponding sites of pUC18.

3.  Sequencing methods

For sequencing the cDNA clones, either the chemical method of Maxam and Gilbert (1980)
or the dideoxy chain termination method of Sanger et al. (1977) was used. Subclones from
pPM-F12 cDNA were generated by sequential digestion with exonuclease III and S1 nuclease,
followed by treatment with PolIk and T4 DNA ligase as described by Henikoff (1984).

(c)  Results  and  Discussion 

1.  Analysis  of  cDNA  encoding  the  structural  proteins 

Based on our unpublished nucleotide sequence data in the region upstream of the
N-terminus of NS1 previously reported (Putnak et al., 1988), a synthetic primer complementary to
nt 2328-2348 of DEN-2 RNA in the C-terminal region of E glycoprotein was used for the
cDNA synthesis. The subsequent cloning step gave rise to several independent cDNA clones of
various lengths in the structural region, possibly due to some heterogeneity in the population of
cDNA molecules (clones 1-6, in Fig. 1). The longest cDNA clone was about 2.4 kb (clone 4),
which appeared to contain nearly all of the sequences upstream of the primer site, based on the
comparison with the nucleotide sequence of DEN-2JAM (Deubel et al., 1986). The sequence
analysis of the structural region was carried out on both strands of clones 4-7 (Fig. 1). The
sequence of this region is shown in Fig. 2 with the exception of about seven nucleotides at the
5'-end.

2.  Sequence analysis in the region encoding the nonstructural
proteins and in the 3'-terminal noncoding region

A partial sequence in the region encoding NS3 was previously reported (Yaegashi et al.,
1986). To complete the sequence analysis in the NS3 region and extend our analysis toward the
3'-terminus of the DEN-2 RNA, the cDNA library was screened by hybridization using the clones
previously sequenced (Yaegashi et al., 1986) as probes. The new clones were ordered along the
genome by sequencing at their termini and by using them in rescreening the library, which gave
rise to the clone'; ''16 (Fig. 1). To obtain the clone(s) containing the entire 3'-terminal
sequence of DEN-2 RNA, a different strategy was used. The poly(A)-tailed RNA was used for
cDNA synthesis as described in MATERIALS AND METHODS, Section b. From the ds cDNA
mixture, the sequences containing the 3'-terminal end including the poly(A) tail were amplified
by the Taq polymerase-catalyzed chain reaction (Saiki et al., 1988; Scharf et al., 1986) (Fig. 3)
using primer #1, containing the oligo(dT)16 tract, and primer #2 (nt 9373-9392). This strategy
allowed us to amplify and identify the 3'-terminal cDNA clones that contained the poly(A) tail.

However, in addition to the 1.4-kb fragment expected from the distance between primer #2 (nt
9373 in Fig. 2) and the 3'-terminus, based on the data of Hahn et al. (1988), two additional major
DNA fragments of about 0.4 kb and 0.8 kb in length (Fig. 3A, lane 2) were also obtained. The
possibility that the generation of additional DNA fragments (major fragments of 0.4 kb and 0.8 kb,
and other minor fragments) is unique to the use of primer #1 in the PCR reaction was verified as
follows. A different primer, AGAACCTGTTGATTCAACAGCACC, complementary to the
3'-terminal sequence of the DEN-2S1 genome (Hahn et al., 1988) (primer #3), was substituted for the
oligo(T)-containing primer #1 in the PCR reaction, and a single band of 1.4 kb was obtained (Fig.
3B, lane 2). This was further supported by the fact that, when the purified 1.4-kb DNA fragment
from the PCR reaction product was used for a second set of PCR reaction cycles using the same
primers #1 and #2, an identical pattern of additional DNA fragments was produced (Fig. 3A, lane
3), confirming that these DNA fragments were products unique to primer #1 in the PCR
reaction, possibly arising from its annealing to other sites. The origin of these spurious DNA
fragments was not further investigated. Subsequent cloning of the 1.4-kb DNA fragment gave
rise to several transformants. Three clones (clones 18-20 in Fig. 1) containing inserts of about
1.4 kb in length were selected for further characterization. Sequence analysis from their termini
revealed the presence of the poly(A) tail of 22-26 nt in length.
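
The size expected for the PCR product can be checked with a short calculation. The sketch below is illustrative only: it takes the genome length of 10,723 nt reported elsewhere in this report and the start of primer #2 at nt 9373, and it reads primer #3 as AGAACCTGTTGATTCAACAGCACC, which is an assumption where the scanned text is unclear. The function names are hypothetical.

```python
# Illustrative sketch; constants are read from the report text, and the
# primer #3 sequence is an assumed reading of an unclear scan.
GENOME_LENGTH_NT = 10723      # total genome length given in the report
PRIMER2_START_NT = 9373       # first nucleotide covered by primer #2
PRIMER3 = "AGAACCTGTTGATTCAACAGCACC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def genome_span(forward_start: int, genome_length: int) -> int:
    """Number of genome nucleotides from the forward primer to the 3' end."""
    return genome_length - forward_start + 1


span = genome_span(PRIMER2_START_NT, GENOME_LENGTH_NT)
print(span)                          # genome-derived part of the amplicon
print(reverse_complement(PRIMER3))   # genome-sense strand the primer pairs with
```

The genome-derived span is about 1.35 kb; the non-templated primer tail and the poly(A) tail add a little more, which fits the band of about 1.4 kb described above.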

It was reported that the products of 30-cycle PCR amplifications contained a total of 17
misincorporations, consisting of transitions and transversions distributed randomly throughout 28
separate clones of 239 bp DNA (Saiki et al., 1988). The overall error frequency of Taq polymerase
in this case was 0.25%, although the actual rate of misincorporation per nucleotide per cycle was
estimated at 2 x 10⁻⁴ (Saiki et al., 1988). The sequence data derived from the conventional cDNA
clones 16 and 17 extended up to nt 10,250. The sequence data for the region from nt 10,200 to the
3'-terminus were derived from the PCR clone 20. Six nucleotide differences were noted
between DEN-2NGS-C and DEN-2JAM in this region. The possibility that some of these
differences were due to errors by Taq polymerase could not be ruled out.
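
The error frequency quoted from Saiki et al. (1988) follows directly from the counts given above. The sketch below reproduces it and then makes a rough, assumption-laden estimate of how many misincorporations might be expected in the PCR-derived region (nt 10,200 to 10,723). It assumes that the 30-cycle frequency applies unchanged to the 25-cycle product used here, which probably overestimates slightly. The variable names are hypothetical.

```python
# Counts quoted in the text from Saiki et al. (1988).
MISINCORPORATIONS = 17
CLONES = 28
INSERT_LENGTH_BP = 239

# Region of the genome sequenced only from the PCR clone (assumed inclusive).
PCR_REGION_NT = 10723 - 10200 + 1


def error_frequency(errors: int, clones: int, length_bp: int) -> float:
    """Observed misincorporations per nucleotide sequenced."""
    return errors / (clones * length_bp)


freq = error_frequency(MISINCORPORATIONS, CLONES, INSERT_LENGTH_BP)
expected_errors = freq * PCR_REGION_NT  # rough estimate under the stated assumption

print(f"observed error frequency: {freq:.2%}")
print(f"PCR-derived region: {PCR_REGION_NT} nt")
print(f"expected misincorporations in that region: {expected_errors:.1f}")
```

Under these assumptions roughly one misincorporation would be expected in the region, so PCR error could plausibly account for some, but probably not all, of the six differences noted.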

3.  Organization of DEN-2 genome

The complete sequence of the DEN-2NGS-C genome, with the exception of about seven
nt from the 5'-noncoding region, based on the comparison with that of DEN-2JAM (Deubel et al.,
1986), is shown in Fig. 2. It includes the previously published data (Yaegashi et al., 1986;
Putnak et al., 1988) and is 10,723 nt in length, which is identical to that of DEN-2JAM. It is 20
nt longer than the DEN-2S1 genome. The base composition is very similar to that of the other
DEN-2 strains (Vezza et al., 1980; Hahn et al., 1988; Deubel et al., 1988) (data not shown).
Comparison of the sequence of DEN-2NGS-C indicates that the genomic organization of the virus
is similar to that of other flaviviruses, such as YF (Rice et al., 1985), WN (Castle et al., 1985;
1986), DEN-4 (Zhao et al., 1986; Mackow et al., 1987), PR-159 isolate of DEN-2S1 strain (Hahn
et al., 1988), DEN-2 strain 1409 isolated in Jamaica in 1983 (DEN-2JAM) (Deubel et al., 1988),
JE (Sumiyoshi et al., 1987), and Kunjin (Coia et al., 1988). The lengths of the 5'- and
3'-nontranslated sequences are identical to those of the DEN-2JAM strain, being 96 and 454 nt,
respectively. The sequences of the 5'-nontranslated segments of DEN-2NGS-C and DEN-2JAM
are identical. Between DEN-2NGS-C and DEN-2S1, there are four nucleotide differences in the
5'-noncoding region. In the region encoding NS3, there are nine additional nucleotides in
DEN-2NGS-C, similar to the difference between DEN-2JAM and DEN-2S1 (Deubel et al., 1988).
In addition, in the 3'-noncoding region of DEN-2NGS-C, there are 11 additional nucleotides
compared with that of the DEN-2S1 strain; and it is more divergent in the 3'-distal half of the
noncoding region than in the 3'-proximal half. The 3'-terminal 79 nt of a number of flavivirus
genomes have been shown to have the potential to form a hairpin loop structure (Hahn et al.,
1987). This is consistent with the notion that the 3'-proximal half of the genome, which is
conserved even among evolutionarily distant flaviviruses, may be involved in replication (Rice et
al., 1985; Brinton et al., 1986; Wengler and Castle, 1986; Zhao et al., 1986; Takegami et al.,
1986).

4.  Deduced polyprotein sequence of DEN-2NGS-C genome and
its cleavage sites

The translated sequence of the genome as shown in Fig. 2 indicates that one long ORF
encodes 3391 aa residues. The codon usage is non-random, as noted by other investigators, and
is very similar to that of the other DEN-2 strains (data not shown). The order of the gene
products in the structural region is the capsid C, the precursor of the membrane glycoprotein prM,
which is processed to M, and the envelope protein E, which is followed by the nonstructural
proteins NS1, NS2A, NS2B, NS3, ns4a, NS4B and NS5. This order was originally established for YF
by Rice et al. (1985), and recently modified with respect to the location of NS2A and NS4B
(Speight et al., 1988). The assignments of the cleavage sites indicated in Fig. 2 are based on the
data from the direct N-terminal amino acid sequencing of these proteins isolated from DEN-2
virions for E (Bell et al., 1985), or from the DEN-2-infected cells for NS1, NS3 and NS5
(Biedrzycka et al., 1987), or on homology with the established cleavage sites of YF (Rice et al.,
1985), WN (Castle et al., 1985; 1986; Wengler et al., 1985), and KUN encoded proteins (Speight
et al., 1988).

The C protein contains 16 R and 10 K residues (about 20% of the protein), which
probably account for its affinity to the viral genome (Rice et al., 1985). The initiating M residue
of C protein is probably removed by the cellular methionine peptidase, although this step is not
well characterized. The C-terminal domains of C, M and E are hydrophobic, each of which
probably serves as a signal sequence for the insertion of the respective protein that follows (prM,
E, and NS1, respectively) across the membrane and into the lumen of endoplasmic reticulum,
where it is cleaved by the host signal peptidase. The sequences V-M-A and V-Q-A conform to the
consensus site proposed by von Heijne (1985; 1986) for cleavage by the cellular signal peptidase,
and might be involved in generating the N-terminus of prM and NS1, respectively. The cleavage
at the prM-M junction occurs as a late step in virus maturation (Shapiro et al., 197.5). The prM
contains one putative N-glycosylation site which is not present in the mature M protein. The
sequence of four residues preceding the cleavage site of E glycoprotein, P-A-Y-S, is conserved in
KUN, WN, MVE, SLE, JE, and YF (Trent et al., 1987). But in DEN-2 strains, it is P-S-M-T,
which diverges to P-S-M-A in DEN-1 (Mason et al., 1987), or to P-S-Y-G in DEN-4 (Zhao et al.,
1986).

The locations and identities in the polyprotein sequence of NS2A, NS2B, NS3, NS4B,
and NS5 were recently established by partial N-terminal amino acid sequences of five KUN
nonstructural proteins (Speight et al., 1988). The sites assigned for NS2A and NS4B of KUN are
upstream of those originally proposed for the corresponding YF proteins (Rice et al., 1985), and
are also present in the corresponding positions of WN (Castle et al., 1986), MVE (Dalgarno et al.,
1986), and SLE (Trent et al., 1987) viruses. They conform to the consensus sequence proposed
for cleavage by the host signal peptidase, V-X-A (von Heijne, 1985; 1986), rather than by the
putative viral protease originally assigned for these cleavages (Rice et al., 1985). In the case of
DEN viruses, the potential cleavage sites that would generate the N-termini of NS1 and NS2A,
conforming to the consensus sequence of the type V-X-A, are conserved in all three DEN-2
strains, except that no stop transfer or translocation sequences occur upstream of this signal for
NS2A (Coia et al., 1988). Moreover, for NS4B of DEN, this site (T-M-A) (see Fig. 4) does not
strictly conform to the "-3 to -1" rule preceding the cleavage site, as identified for KUN virus
(Speight et al., 1988), although the first four amino acids at the putative N-terminus of NS4B are
identical in all DEN viruses so far examined, as well as in KUN and WN viruses. Interestingly,
the sequence of three amino acids preceding T-M-A is V-A-A (Fig. 4), which conforms to the
"-3 to -1" rule. So it is possible that the requirements for cleavage by signalase are not absolute,
and it remains to be seen whether this nearby sequence might be able to serve for signalase
recognition. The cleavage sites preceding the N-termini of NS2B, NS3, ns4a, and NS5 are the same
as originally assigned by Rice et al. (1985). In general, they contain a pair (or a cluster) of basic
amino acids, followed by a short chain amino acid residue, which are probably recognized by a
viral encoded protease. The location of the N-terminus of the hypothetical ns4a and its sequence
at its putative cleavage site are tentative for any flavivirus polyprotein. It is based on the
measured size of NS3 and the occurrence of the pair of basic amino acids, followed by a short
chain amino acid.
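
The V-X-A comparison made above can be expressed as a simple pattern check. The sketch below is a literal test for the V-X-A form named in the text, applied to the four tripeptides discussed; it is not an implementation of the full "-3 to -1" rule of von Heijne, which allows a wider set of residues at both positions. The function name is hypothetical.

```python
import re

# Literal V-X-A pattern: valine at -3, any residue at -2, alanine at -1.
VXA_PATTERN = re.compile(r"^V[A-Z]A$")


def matches_vxa(tripeptide: str) -> bool:
    """Return True if a tripeptide such as 'V-M-A' has the V-X-A form."""
    residues = tripeptide.replace("-", "").upper()
    return bool(VXA_PATTERN.match(residues))


# Tripeptides preceding the putative cleavage sites discussed in the text.
for site in ("V-M-A", "V-Q-A", "T-M-A", "V-A-A"):
    print(site, matches_vxa(site))
```

Of the four, only T-M-A fails the literal pattern, while the V-A-A immediately preceding it passes, which is the observation that prompts the question about signalase recognition raised above.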

Previously published data (Yaegashi et al., 1986) on the comparison of the amino acid
sequences between YF and DEN-2NGS-C in the region encoding the nonstructural proteins
revealed that these amino acid sequences are much less conserved except for NS3 and NS5,
consistent with their postulated role in viral replication. Three regions of NS3 were shown by Rice
et al. (1986b) to share some similarities with regions of RNA-dependent RNA polymerases of ten
positive-stranded RNA genomes. Although the primary amino acid sequences are less conserved
among different serotypes of DEN viruses, and even less so among members of different
serological groups, the hydrophobicity plots of all these flavivirus genomes are strikingly similar
(data not shown), suggesting a common function for the viral-coded proteins.

5.  Comparison of DEN-2NGS-C genome with those of other
DEN viruses

Nucleotide divergence between the three DEN-2 strains was determined. The results
shown in Table 1 indicate that there are a total of 836 nt changes, consisting of 749 transitions and
87 transversions, between DEN-2NGS-C and DEN-2S1 (7.8%). However, there are only 82 aa
changes (2.4%), 20 in the structural and 62 in the nonstructural proteins. On the other hand, the
DEN-2NGS-C and DEN-2JAM strains are more closely related, as there are only a total of 489 nt
changes (4.6%), comprising 394 transitions and 95 transversions, which resulted in 58 aa changes
(1.7%). The nucleotide sequence identities between DEN-2 and DEN-4, or between DEN-2 and
DEN-1, are considerably less, indicating that these serotypes of DEN viruses diverged from DEN-2
strains much earlier.
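
The percentages quoted above can be recomputed from the counts. The sketch below does so using a genome length of 10,723 nt and a polyprotein length of 3391 aa, both taken from this report; several digits in the scanned text are unclear, so the counts used here (836 = 749 + 87, 489 = 394 + 95, and 58 aa changes) are readings chosen because they agree with the stated percentages. The dictionary keys are hypothetical labels.

```python
GENOME_NT = 10723       # genome length from the report
POLYPROTEIN_AA = 3391   # polyprotein length from the report

# Counts as read from the text; some digits are assumed readings.
COMPARISONS = {
    "NGS-C vs S1": {"transitions": 749, "transversions": 87, "aa_changes": 82},
    "NGS-C vs JAM": {"transitions": 394, "transversions": 95, "aa_changes": 58},
}


def summarize(counts: dict) -> tuple:
    """Return (total nt changes, nt divergence %, aa divergence %)."""
    nt_changes = counts["transitions"] + counts["transversions"]
    nt_percent = round(100 * nt_changes / GENOME_NT, 1)
    aa_percent = round(100 * counts["aa_changes"] / POLYPROTEIN_AA, 1)
    return nt_changes, nt_percent, aa_percent


for pair, counts in COMPARISONS.items():
    print(pair, summarize(counts))
```

Both pairs show far more nucleotide than amino acid divergence, which is what would be expected if most of the substitutions are silent.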

Fig. 4 shows the alignment of the deduced amino acid sequence of the DEN-2NGS-C strain
with those of DEN-2JAM (Deubel et al., 1988), DEN-2S1 (Hahn et al., 1988), DEN-4 (Zhao et
al., 1987), and a partial amino acid sequence of DEN-1 (Mason et al., 1987). The differences in
the amino acid sequences between DEN-2 and DEN-1 range •I '' the NS2A to 80% in the
NS4B protein (Table II). The overall similarity between DEN-2 and DEN-4 is only 68%, and
between DEN-2 and DEN-1 in the regions sequenced (C, prM [M], E, and NS1), it is 69%.

Recently, the nucleotide sequence of the region encoding the structural proteins of
DEN-2NGS-C, which was originally derived from the Queensland Institute of Medical Research
(DEN-2NGS-C-QIMR), was published (Gruenberg et al., 1988). Comparison of these data with
ours reveals that the two sequences are essentially identical except for a few differences, which
are as follows. The sequence of the first 76 nt in the 5'-noncoding region was not reported by
Gruenberg et al. (1988). In addition, within the structural region reported, there are three amino
acid changes resulting from nucleotide changes at second-codon positions. The amino acid
residues at positions 171, 327, and 734 of the polyprotein sequence are K, E, and I in our study
(Fig. 2), whereas they are R, K, and T, respectively, in the published report (Gruenberg et al.,
1988). Two independent clones were sequenced on both strands in our study, and they were
identical in having these three changes in the amino acid residues. Therefore, these differences
between the sequences could be attributed to differences in the viral RNA resulting from
different passage history and/or to changes due to error by the reverse transcriptase during cDNA
synthesis. Similar to the report of Hahn et al. (1988), we have also observed a number of clonal
variations in our sequence analysis. For example, aa #451 is T in one cDNA clone (clone 4) and I
in an independent cDNA clone (clone 6). Similarly, the amino acid residues at #2391 and 2392
are D and G in clone 15, and G and R in clone 13 (Yaegashi et al., 1986). Since some of these
variations are at sites which are highly conserved among the different DEN viruses, the conserved
amino acid residues were chosen (for example, the amino acid residues at positions 451, 2391
and 2392).

6.  Conservation  of  glycosylation  sites  and  cysteine  residues 

The potential sites of glycosylation in E or in the nonstructural proteins other than NS1 are
relatively poorly conserved among the various flaviviruses. The single glycosylation site of prM
at Asn-69 is conserved in all DEN viruses (Fig. 4), but is not conserved in YF or the WN-MVE-SLE
subgroup (Hahn et al., 1988). The glycosylation sites of NS1 at Asn-130 and Asn-207 are
conserved in all DEN strains (Fig. 4), as well as in the NS1 of other flaviviruses (Putnak et al.,
1988; Hahn et al., 1988), suggesting that glycosylation might be important for its function. They
are present at identical locations in the protein, except that in YF one of them is shifted by one
residue (aa #208) (Hahn et al., 1988). As noted by others (Rice et al., 1986b), the cysteine
residues are highly conserved in the structural proteins and NS1 of all DEN strains and other
flaviviruses so far sequenced, with the exception of a single C residue in the NS1 of DEN-4
(corresponding to C residue #1087 of DEN-2NGS-C in Fig. 1, which is substituted by V in DEN-4).
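
Potential N-linked glycosylation sites of the kind discussed above are conventionally located by scanning for the sequon N-X-S/T, where X is any residue except proline. The sketch below shows such a scan; the sequon rule is general background rather than something stated in this report, and the example peptide is hypothetical, not a DEN-2 sequence.

```python
import re

# N-X-S/T sequon with X != P; lookahead allows overlapping matches.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide for illustration only.
example = "MKNVSAANPTQLNGT"
print(find_sequons(example))
```

In the hypothetical peptide, the asparagine followed by proline is skipped, while the other two are reported. Applied to aligned polyprotein sequences, positions found in every strain would correspond to the conserved sites described above.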


(d)  Conclusions

The sequence data for the prototype DEN-2NGS-C provide additional information regarding the
evolution of the geographically distinct isolates of the same DEN serotype. These comparative data
point out that the DEN-2NGS-C isolated in 1944 and the DEN-2JAM isolated in 1983 have
undergone very little divergence (<2%), compared with an attenuated strain of DEN-2 isolated in
1969 in Puerto Rico. Many conserved amino acid substitutions are present in the structural and
nonstructural protein domains of the three DEN-2 strains. Future studies directed toward
examining the differences in these domains would be expected to provide valuable insight into the
relationship between the structure and function of these viral proteins, once a suitable
mammalian expression system is established.

Figure Legends


Fig. 1.  Sequencing strategy of DEN-2NGS-C genome.

The various cDNA clones and their map positions with respect to the viral genome are
shown. The numbers 1-20 refer to the clones p74-A28 (1), p72-A13 (2), p72-C15 (3), pKT2.4
(4), pKT1.8 (5), pKT1.6 (6), pRP-2 (7), pVV9 (8), pVV18 (9), pVV1 (10), pYS505 (11),
pPM-A10 (12), pVV17 (13), pYS-132 (14), pKT-A4 (15), pPM53 (16), pPM-F12 (17), pPM-PCR1
(18), pPM-PCR2 (19), and pPM-PCR3 (20), respectively. The nucleotide sequences of the cDNA
clones pRP-2, pVV9, pVV18, pVV1, and pVV17 have already been published (Yaegashi et al.,
1986; Putnak et al., 1988). cDNA clones 18-20 are derived from the poly(A)-tailed RNA
and subsequent amplification by the PCR reaction. Sequencing was carried out by either the
dideoxy chain termination method (Sanger et al., 1977) (dotted arrows), or by the chemical method
of Maxam and Gilbert (1980) (solid arrows). The solid arrows refer to sequencing in the 3'-5'
direction, and those preceded by dots represent sequencing in the 5'-3' direction. Subclones for
pPM-F12 were generated by the method of Henikoff (1984).

Fig. 2. Composite nucleotide sequence of DEN-2NGS-C derived from cDNA clones.

The nucleotide sequences of the cDNA clones shown in Fig. 1 overlapped with the
previously reported sequences for NS1, ns2a, ns2b, ns4b, and portions of NS3, ns4a, and NS5
(Putnak et al., 1988; Yaegashi et al., 1986). The complete sequence, with the exception of about
seven nucleotides at the 5'-noncoding region, based on the comparison with DEN-2JAM (Deubel
et al., 1986), is shown along with the deduced amino acid sequence of the polyprotein precursor.
The confirmed N-linked glycosylation sites are boxed and the potential ones are circled. The
nomenclature of the viral proteins originally proposed by Rice et al. (1985) and recently modified
by Speight et al. (1988) for NS2A, NS2B, and NS4B is followed. The horizontal arrows indicate
the start points of these viral proteins. The cleavage sites for the generation of the N-terminus of
the various proteins are based on the partial amino acid sequencing of E (Bell et al., 1985), and
NS1, NS3, and NS5 (Biedrzycka et al., 1987), or on the homology with other flaviviruses (Rice
et al., 1985; Castle et al., 1985; 1986; Speight et al., 1988) (see section d of RESULTS AND
DISCUSSION).
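
As an illustration of how potential N-linked glycosylation sites such as those circled in Fig. 2 can be located in a deduced amino acid sequence, the short Python sketch below scans for the standard N-X-S/T sequon (X being any residue except proline). It is not the procedure used in this report, which does not describe its search method; the function name and the demonstration peptide are hypothetical.

```python
import re

def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T motifs (X is not Pro)."""
    # A lookahead is used so that overlapping motifs are all reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", protein)]

# Hypothetical peptide for demonstration only; not taken from the DEN-2 sequence.
demo = "MNNQTANPSGNGTK"
print(find_sequons(demo))  # positions of the Asn in N-Q-T and N-G-T
```

A match indicates only a potential site; whether a given site is actually glycosylated has to be confirmed experimentally, which is the distinction the legend draws between boxed and circled sites.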

Fig.  3.  Amplification  of  the  3'-terminal  cDNA  clone  by  polymerase  chain  reaction. 

Panel A: DEN-2 RNA was tailed with poly(A) using E. coli poly(A) polymerase (Sippel,
1973; Gething et al., 1980). The first-strand cDNA was synthesized using a primer containing a
stretch of T residues and the potential to form SmaI and XbaI sites (Primer #1;
CCCCCGGGTCTAGA(T)15T-OH), and the second strand of cDNA as described (Okayama et al.,
1982; Gubler and Hoffman, 1983). For amplification, the chain reaction catalyzed by Taq polymerase
(Perkin-Elmer-Cetus Corp., CT, USA) was used. Primer #1 (see above) and the
oligodeoxynucleotide GGACAAGTTGGTACCTATGG as primer #2 (nt 9373-9392 in Fig. 2)
were used in a reaction mixture (100 µl) containing 10 mM Tris.HCl, pH 8.3, 1.5 mM MgCl2,
50 mM KCl, 0.1% gelatin (w/v), dNTPs (200 µM), 1 µM each of the primers and 2.5 U of Taq
polymerase. The sample was overlaid with mineral oil to prevent evaporation and was
incubated successively for 1 min at 94° C, 2 min at 37° C and 3 min at 72° C, and this cycle was
repeated 25 times. Subsequent to the reaction, the sample was loaded onto an agarose gel (1%) and
electrophoresed. The gel was stained with ethidium bromide and photographed. A. Lane 1, λ DNA
digested with HindIII and used as size markers; the bands from top to bottom have sizes of
23-kb, 9.7-kb, 6.6-kb, 4.3-kb, 2.3-kb, 2.1-kb, and 0.56-kb; lane 2, PCR reaction product after 25
cycles; the sizes of the three major bands from top to bottom are 1.4-kb, 0.8-kb, and 0.4-kb,
respectively; lane 3, second PCR reaction performed using the primers #1 and #2 and the 1.4-kb
fragment purified from lane 2 as the template.

Panel B: PCR reaction carried out using primer #2 (see above) and primer #3,
AGAACCTGTTGATTCAACAGCACC, which is complementary to the 3'-terminal sequence of
DEN-2 RNA (Hahn et al., 1988), and the cDNA mixture that was used in the experiment in lane 2.
The size of the single band is about 1.4 kb.
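
The primer design described in this legend can be checked with a few lines of Python. The sketch below is illustrative only: it looks for the standard SmaI (CCCGGG) and XbaI (TCTAGA) recognition sequences in the 5' portion of primer #1 (the T stretch is omitted) and derives the sequence to which primer #3 is complementary. The primer sequences are as read from this legend; the function and variable names are hypothetical.

```python
# Standard recognition sequences of the two enzymes named in the legend.
SITES = {"SmaI": "CCCGGG", "XbaI": "TCTAGA"}

def site_positions(seq, sites=SITES):
    """Map each enzyme name to the 1-based start of its first site, or None."""
    found = {}
    for name, motif in sites.items():
        index = seq.find(motif)
        found[name] = index + 1 if index >= 0 else None
    return found

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

primer1_5prime = "CCCCCGGGTCTAGA"      # primer #1 without its T stretch
primer3 = "AGAACCTGTTGATTCAACAGCACC"   # primer #3 as read from the legend

print(site_positions(primer1_5prime))  # where the SmaI and XbaI sites begin
print(reverse_complement(primer3))     # sequence that primer #3 anneals to
```

The two sites sit side by side at the 5' end of primer #1, so that amplified products carry them upstream of the oligo(dT) stretch.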

Fig. 4. Alignment of the complete amino acid sequences of DEN viruses.

The amino acid sequences of DEN-2NGS-C, DEN-2JAM (Deubel et al., 1988),
DEN-2S1 from the PR-159 isolate (Hahn et al., 1988), DEN-4 (Zhao et al., 1986; Mackow et al.,
1987), and DEN-1 (Mason et al., 1987) are compared. The dots indicate identical amino acid
residues. The horizontal arrows represent the start points of the various viral proteins as shown in
Fig. 2.

Legend  to  Tables 

TABLE I. Divergence in nucleotide sequences among Dengue 2 strains

^Using the nucleotide sequence of DEN-2NGS-C as the reference, DEN-2JAM
(JAM; Deubel et al., 1988) and DEN-2S1 (PR/S1; Hahn et al., 1988) are compared to calculate the
number of transitions (purine → purine, or pyrimidine → pyrimidine) and transversions (purine →
pyrimidine, and vice versa) in the regions encoding the structural and nonstructural proteins.
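
The classification used for Table I can be sketched in Python as follows. This is an illustrative sketch, not the program used to prepare the table, and it assumes the two sequences have already been aligned to equal length; positions containing a gap or an ambiguous base are skipped, which is an assumption since the report does not say how such positions were handled. The example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}
BASES = PURINES | PYRIMIDINES

def count_substitutions(ref, other):
    """Count (transitions, transversions) between two aligned sequences."""
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(ref.upper(), other.upper()):
        # Skip identical positions, gaps and ambiguous bases (assumption).
        if a == b or a not in BASES or b not in BASES:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # change within the same base class
        else:
            transversions += 1    # purine exchanged for pyrimidine or vice versa
    return transitions, transversions

# Hypothetical aligned fragments, for demonstration only.
print(count_substitutions("ATGCCGTA", "ACGCTGTT"))
```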
TABLE II. Divergence in amino acid sequences among Dengue viruses

^The amino acid sequence of DEN-2NGS-C is used as the reference, and
compared with DEN-2JAM (Deubel et al., 1988), DEN-2S1 (Hahn et al., 1988), DEN-4 (Mackow
et al., 1987) and DEN-1 (Mason et al., 1987). The total number of amino acid residues in
each of the structural and nonstructural proteins of DEN-2NGS-C is shown. Proteins encoded
by the virus refer to: C, capsid; prM, precursor of membrane protein, M; E, envelope
glycoprotein; NS1, NS2A, NS2B, NS3, ns4a, NS4B, and NS5 are the nonstructural proteins. The total
length of each protein as number of amino acids is given. The number of dissimilar amino acid
residues and the % identities are also shown. The values were calculated from the alignment of the
amino acid residues of the various DEN viruses from Fig. 4.
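
A minimal Python sketch of how the number of dissimilar residues and the % identity might be derived from an alignment row written in the dot notation of Fig. 4 is given below. It is a sketch under stated assumptions, not the calculation actually used for Table II: the % identity is taken relative to the length of the reference protein, and the handling of gaps is not addressed because the report does not describe it. The sequences shown are hypothetical.

```python
def divergence(ref, row):
    """Return (length, dissimilar, percent_identity) for one alignment row.

    `row` uses '.' for residues identical to the reference, as in Fig. 4.
    """
    if not ref or len(ref) != len(row):
        raise ValueError("reference and row must be non-empty and equally long")
    # A position is dissimilar when the row shows a residue that differs
    # from the reference; a dot always means identity.
    dissimilar = sum(1 for a, b in zip(ref, row) if b != "." and b != a)
    identity = 100.0 * (len(ref) - dissimilar) / len(ref)
    return len(ref), dissimilar, identity

# Hypothetical 10-residue reference and comparison row, for demonstration only.
print(divergence("MKTLLVAGAS", "..S....T.."))
```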
Table I

Divergence  in  nucleotide  sequences  among  DEN-2  strains^ 
Table II

Divergence  in  amino  acid  sequences  among  DEN-2  strains 